{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "8b79050e",
   "metadata": {},
   "source": [
    "\n",
    "# Clinical Study Designs and Statistical Power Analysis\n",
    "\n",
    "This notebook explores the basics of clinical study designs, including sample size calculations and statistical power analysis. Links to the resources are provided for practice.\n",
    "        "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0a48423b",
   "metadata": {},
   "source": [
    "\n",
    "## Clinical Study Design Concepts\n",
    "\n",
    "### Types of Studies\n",
    "\n",
    "1. **Observational Studies:**\n",
    "   - Includes cohort, cross-sectional, and case-control studies.\n",
    "   - Observes outcomes without intervention.\n",
    "\n",
    "2. **Experimental Studies:**\n",
    "   - Includes randomized controlled trials (RCTs) and other interventional studies.\n",
    "   - Involves assigning interventions to study subjects.\n",
    "\n",
    "### Key Terms\n",
    "\n",
    "- **Exposure**: A factor that may influence an outcome.\n",
    "- **Outcome**: The result being studied, such as disease occurrence.\n",
    "- **Confounders**: Variables that may distort the true relationship between exposure and outcome.\n",
    "        "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "420730ab",
   "metadata": {},
   "source": [
    "\n",
    "## Sample Size Calculation\n",
    "\n",
    "Calculate the required sample size for a study based on effect size, significance level, and desired power.\n",
    "        "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "000faba0",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "import statsmodels.stats.power as smp\n",
    "\n",
    "# Parameters for sample size calculation\n",
    "effect_size = 0.5  # Medium effect size (Cohen's d)\n",
    "alpha = 0.05       # Significance level\n",
    "power = 0.8        # Desired power\n",
    "\n",
    "# Calculate required sample size\n",
    "sample_size = smp.tt_solve_power(effect_size=effect_size, alpha=alpha, power=power, alternative='two-sided')\n",
    "print(f\"Required Sample Size: {round(sample_size)}\")\n",
    "        "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f62594db",
   "metadata": {},
   "source": [
    "\n",
    "### Adjusting Parameters\n",
    "\n",
    "Observe the impact of smaller effect size and higher power on sample size requirements.\n",
    "        "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d4d44dfd",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "# Adjust effect size\n",
    "effect_size = 0.2  # Small effect size\n",
    "\n",
    "# Recalculate sample size\n",
    "sample_size_small_effect = smp.tt_solve_power(effect_size=effect_size, alpha=alpha, power=power, alternative='two-sided')\n",
    "print(f\"Required Sample Size (Small Effect Size): {round(sample_size_small_effect)}\")\n",
    "\n",
    "# Increase power\n",
    "power = 0.95  # Higher power\n",
    "\n",
    "# Recalculate sample size\n",
    "sample_size_high_power = smp.tt_solve_power(effect_size=0.5, alpha=alpha, power=power, alternative='two-sided')\n",
    "print(f\"Required Sample Size (Higher Power): {round(sample_size_high_power)}\")\n",
    "        "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e430611a",
   "metadata": {},
   "source": [
    "\n",
    "## Prevalence and Incidence Calculations\n",
    "\n",
    "Explore how to calculate prevalence and incidence from study data.\n",
    "        "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5b9d5d08",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "# Example data\n",
    "population_size = 100000\n",
    "prevalent_cases = 5000\n",
    "new_cases = 500  # Cases over a time period\n",
    "\n",
    "# Calculate prevalence\n",
    "prevalence = (prevalent_cases / population_size) * 100  # as a percentage\n",
    "print(f\"Prevalence: {prevalence:.2f}%\")\n",
    "\n",
    "# Calculate incidence\n",
    "incidence = (new_cases / population_size) * 100000  # per 100,000 population\n",
    "print(f\"Incidence: {incidence:.2f} per 100,000 population\")\n",
    "        "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "025ab541",
   "metadata": {},
   "source": [
    "\n",
    "## Visualizing Study Data\n",
    "\n",
    "Use histograms and scatter plots to visualize key aspects of study data.\n",
    "        "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "09f481cf",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "import matplotlib.pyplot as plt\n",
    "import seaborn as sns\n",
    "import pandas as pd\n",
    "\n",
    "# Example dataset\n",
    "study_data = pd.DataFrame({\n",
    "    \"Age\": [30, 35, 40, 45, 50, 55, 60, 65, 70],\n",
    "    \"Prevalence\": [5, 6, 7, 8, 9, 10, 12, 15, 18]\n",
    "})\n",
    "\n",
    "# Histogram of prevalence\n",
    "sns.histplot(study_data['Prevalence'], kde=True)\n",
    "plt.title(\"Prevalence Distribution\")\n",
    "plt.xlabel(\"Prevalence (%)\")\n",
    "plt.ylabel(\"Frequency\")\n",
    "plt.show()\n",
    "\n",
    "# Scatter plot of age vs prevalence\n",
    "sns.scatterplot(x=study_data['Age'], y=study_data['Prevalence'])\n",
    "plt.title(\"Age vs Prevalence\")\n",
    "plt.xlabel(\"Age\")\n",
    "plt.ylabel(\"Prevalence (%)\")\n",
    "plt.show()\n",
    "        "
   ]
  }
 ],
 "metadata": {},
 "nbformat": 4,
 "nbformat_minor": 5
}
